DUAL MODE DATA COMPRESSION TECHNIQUE

Related applications 

This application is related to U.S. Application Serial No.
(Attorney Docket: 3175-51), entitled "Enhanced Data Compression
Technique" and filed concurrently herewith on December 29, 2000.

Technical Field 

The present application relates generally to data compression
and more particularly to an enhanced data compression technique
particularly suitable for use in the graphical arts for compressing
large images.

Background Art 

In the graphic arts there is a tendency to have extremely
large, one-bit-per-sample images approaching or even exceeding 2
gigabytes of data. The need to compress such data has been well
known for many years.

One proposed technique for compressing such data is commonly
referred to as a pack-bit (PB) compression technique. Using
proposed PB compression techniques, either a string of characters
is preceded with a count and a repeat character code or a single
byte pattern is preceded with a count. Proposed PB compression
techniques are capable of processing data very quickly. These
techniques also provide satisfactory results if the data is either
solid black or solid white, and hence digitally represented in
binary form by all 1's or 0's. Accordingly, PB techniques provide
reasonably satisfactory results for non-color image data.

An exemplary pack-bits representation of a stream of
sequential input data, as it would appear entering a processor
prior to encoding, might include the string of characters
"abc0000000000". Using the PB technique, the processor would first
determine whether or not the first character "a" and the second
character "b" match. Under some proposed PB techniques, the
processor might scan ahead to consider other matches in certain of
the subsequent characters. In any event, since in the present
example the determination is negative, the processor proceeds to
encode the input data as a literal string with a length. The
processor next determines whether or not the second character "b"
and the third character "c" match. Since this determination is
also negative, the processor will proceed to encode the three
characters of the input data as a literal string with a length. The
processor now determines whether or not the third character "c" and
the fourth character "0" match. Since this determination is also
negative, the processor will proceed to encode the four characters
of the input data as a literal string with a length. The processor
continues by determining whether or not the fourth character "0"
and the fifth character "0" match. Since this determination is
positive, the processor continues by determining whether or not the
immediately subsequent characters in the sequence also match, until
it makes a negative determination. The processor thereby
determines the repeat count for the character "0". Based on the
initial positive determination, the processor also proceeds to
encode the first three characters of the input data sequence, i.e.
"a", "b" and "c", as a literal string with a length and the
following 10 characters of the input data sequence, i.e. the
"0"..."0", as a repeat character with a count.

Accordingly, the processor generates encoded output data
forming a 2-byte sequence including the strings of characters
"82abc" and "090". In the output data, the "8" serves as a header
and indicates that the total length of the sequence is 8 bits and a
literal string follows, the "2" indicates that the length of the
literal string is three characters, i.e. characters "a", "b", and
"c", the first "0" indicates a repeat character follows, and the
"9" indicates that the repeat character represented by the second
"0" is repeated 10 times.

To decode the encoded sequence "82abc090", the receiving
processor first reads the new header "8", which is the highest
order bit, and from the header determines that a literal string
follows. The processor then extracts the length "2" and reads the
next three characters "a", "b" and "c". The processor next reads
the first "0" and from this determines that a repeat character
follows. The processor continues by extracting the count "9" and
reading the next character "0", which is the character to be
repeated 10 times. It will be recognized that by using one-off
numbers such as the "2" to indicate a literal string of 3
characters and the "9" to indicate that a repeat character is
repeated 10 times, a close to 1% improvement is obtainable because
128 bytes can be packed into 129.
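
The pack-bit scheme described above can be sketched in a few lines
of Python. This is a minimal illustration under stated assumptions,
not the byte layout of any particular PB format: the function names
pb_encode and pb_decode, the tuple-based token form, and the limit of
128 characters per token are all hypothetical choices made for
readability. Counts are stored one-off (count minus one), as with the
"2" and "9" of the example.

```python
def pb_encode(data: bytes, max_run: int = 128) -> list:
    """Encode bytes as ('lit', n-1, bytes) and ('rep', n-1, byte) tokens."""
    tokens = []
    literal = bytearray()
    i = 0
    while i < len(data):
        # Measure the run of identical bytes starting at position i.
        run = 1
        while (i + run < len(data) and data[i + run] == data[i]
               and run < max_run):
            run += 1
        if run >= 2:
            # Flush any pending literal string before the repeat token.
            if literal:
                tokens.append(("lit", len(literal) - 1, bytes(literal)))
                literal.clear()
            tokens.append(("rep", run - 1, data[i:i + 1]))
            i += run
        else:
            literal.append(data[i])
            i += 1
            if len(literal) == max_run:
                tokens.append(("lit", len(literal) - 1, bytes(literal)))
                literal.clear()
    if literal:
        tokens.append(("lit", len(literal) - 1, bytes(literal)))
    return tokens


def pb_decode(tokens: list) -> bytes:
    """Reverse pb_encode: copy literals, expand repeats."""
    out = bytearray()
    for kind, count_minus_1, payload in tokens:
        if kind == "lit":
            out += payload
        else:
            out += payload * (count_minus_1 + 1)
    return bytes(out)
```

For the example input, pb_encode(b"abc0000000000") yields a literal
token with stored count 2 followed by a repeat token with stored
count 9, and pb_decode restores the original 13 characters.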

As should be clear from the above, PB techniques process only
one character at a time. Accordingly, PB techniques are incapable
of compressing strings of repeating multiple byte patterns. PB
techniques also have a relatively limited compression rate,
generally no more than 64 to 1. Thus, PB compression techniques
provide unsatisfactory results when used to compress color image
data.



Another proposed technique for compressing image data is
commonly referred to as the Lempel-Ziv-Welch (LZW) compression
technique. Using proposed LZW compression techniques, variable
length strings of byte-based data can be processed. Proposed LZW
compression techniques process the data somewhat slower than PB
compression techniques, but provide satisfactory results on data
representing color images as well as black and white images.
However, since these techniques are based on single bytes of data,
such techniques are incapable of compressing data on an arbitrary
pixel or bit boundary basis. Additionally, although such
techniques are capable of providing a higher compression rate than
PB compression techniques, LZW techniques still offer a somewhat
limited compression rate.

An exemplary LZW representation of a stream of sequential
input data, as it would appear entering a processor prior to
encoding, might include the string of characters "abc01c01". Using
the LZW technique, the encoding and decoding processors must
coordinate on the transmission and receipt of codes. LZW
techniques use a compression dictionary containing some limited
number of compression codes defined during the processing of the
input data. The characters in the input string are read on a
character by character basis to determine if a sub-string of
characters matches a compression code defined during the processing
of prior characters in the input string. If so, the matching
sub-string of characters is encoded with the applicable compression
code. If a sub-string of characters does not match a pre-existing
code, a new code corresponding to the sub-string is added to the
dictionary. Sub-strings are initially defined by codes having 9
bits or digits, but the number of bits may be increased up to 12
bits to add new codes. Once the 12 bit limit is exceeded, the
dictionary is reset and subsequent codes are again defined
initially with 9 bits. In conventional LZW techniques, two codes
are predefined, i.e. defined prior to initiating processing of the
input string. In the present example these codes are the code 100,
representing a reset, and the code 101, representing an end. In
the present example, codes 102, 103, and 104 etc. represent strings
of new patterns which are identified during the processing of the
input data.

Using the LZW technique, the encoding processor would first
read the "a" in the sequence and the "b" immediately thereafter in
the sequence. The processor then determines if a code exists for
the character sequence "ab". Since, in this example, no such code
exists at this point in the processing, a new code 103 is generated
to represent the new pattern string "ab". The processor continues
by reading the "c" immediately following the "b" in the sequence.
The processor determines if a code exists for the character
sequence "bc". Since, in this example, no such code exists at this
point in the processing, a new code 104 is generated to represent
the new pattern string "bc".

The processor continues by reading the "0" immediately
following the "c" in the sequence. The processor determines if a
code exists for the character sequence "c0". Since, in this
example, no such code exists at this point in the processing, a new
code 105 is generated to represent the new pattern string "c0". The
processor continues further by reading the "1" immediately
following the "0" in the sequence. The processor determines if a
code exists for the character sequence "01". Since, in this
example, no such code exists at this point in the processing, a new
code 106 is generated to represent the new pattern string "01". The
processor proceeds by reading the "c" immediately following the "1"
in the sequence. The processor determines if a code exists for the
character sequence "1c". Since, in this example, no such code
exists at this point in the processing, a new code 107 is generated
to represent the new pattern string "1c".

The processor proceeds by reading the "0" immediately
following the second "c" in the sequence. The processor determines
if a code exists for the character sequence "c0". In this example,
such a code, i.e. code 105, does exist. The processor therefore
proceeds by reading the "1" immediately following the second "c0"
in the sequence. The processor determines if a code exists for the
character sequence "c01". Since, in this example, such a code does
not exist, a new code 108 is generated to represent the new pattern
string "c01" which can be represented as "1051". The processor
ultimately generates encoded output data forming a sequence
including the string of characters "100abc01c105".
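
The dictionary-building steps walked through above can be sketched
as follows. This is a simplified, textbook-style LZW encoder offered
only as an illustration: the function name lzw_encode is
hypothetical, single characters are emitted as themselves rather than
as numeric codes, and new codes start at 103 solely to mirror the
numbering used in the walk-through. Code widths, the reset code and
the end code are not modeled.

```python
def lzw_encode(text: str, first_code: int = 103):
    """Return (output, dictionary) for a simplified LZW pass over text."""
    dictionary = {}          # multi-character pattern -> numeric code
    next_code = first_code   # assumption: mirrors the example numbering
    output = []
    w = ""                   # longest pattern matched so far
    for ch in text:
        candidate = w + ch
        if len(candidate) == 1 or candidate in dictionary:
            # Known pattern: keep extending the match.
            w = candidate
        else:
            # Emit the matched pattern and register the new, longer one.
            output.append(dictionary.get(w, w))
            dictionary[candidate] = next_code
            next_code += 1
            w = ch
    if w:
        output.append(dictionary.get(w, w))
    return output, dictionary
```

Run on "abc01c01", this sketch assigns 103 to "ab", 104 to "bc", 105
to "c0", 106 to "01", 107 to "1c" and 108 to "c01", and emits a, b,
c, 0, 1, 105, 1. Prefixed with the reset code 100, that corresponds
to the string "100abc011051" quoted where the text discusses
decoding.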

Using the LZW technique, the encoding processor builds a tree
of codes generated using other codes. This is a primary reason why
the LZW techniques provide satisfactory results even though
processing is performed on a byte by byte basis to find repeating
bytes. That is, the downstream encoding builds on the upstream
encoding. However, using the LZW technique, the encoding processor
can take significant processing time to encode large sequences.
For example, if there is a large, say a megabyte, occurrence of
adjacent 0's or 1's, a significant period of time will be required
by the processor to encode the sequence.

The decoding processor builds a similar tree from the codes
received from the encoding processor. Basically, the decoding
processor performs the reciprocal of the encoding process to decode
the encoded sequence characters "100abc011051".

In summary, the PB compression technique is deficient in that
it addresses only single byte repeats and is limited to a 64 to 1
compression rate. Therefore, it is not suitable for color images.
On the other hand, the LZW compression technique addresses
multi-byte repeats and has a compression rate of perhaps 500 to 1,
but requires significant processing time to build the codes which
are required to obtain good compression. Hence, although the LZW
technique may be suitable where relatively small amounts of data
are involved, where the encoding of gigabytes of data is required,
such as with an 80 inch x 50 inch image having 2400 dots per inch,
the processing time and/or resources to encode data using the LZW
technique make the technique impractical.
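
To put the quoted image dimensions in concrete terms, the following
sketch computes the raw size of an 80 inch x 50 inch image at 2400
dots per inch. It assumes one bit per sample and a single
separation, as in the one-bit-per-sample images mentioned earlier;
the function name is hypothetical.

```python
def one_bit_image_bytes(width_in: int, height_in: int, dpi: int) -> int:
    """Raw size in bytes of a one-bit-per-sample image."""
    width_px = width_in * dpi     # 80 * 2400 = 192,000 samples per row
    height_px = height_in * dpi   # 50 * 2400 = 120,000 rows
    return width_px * height_px // 8


size = one_bit_image_bytes(80, 50, 2400)
print(size)  # 2880000000
```

Under these assumptions a single separation occupies 2,880,000,000
bytes, or roughly 2.9 gigabytes before compression.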



Accordingly, a need exists for a technique which can quickly
compress large amounts of image data, offer a still higher
compression rate than previously proposed techniques, and provide
satisfactory results when used to compress either color or
non-color image data.

Objectives of the Invention 

It is an object of the present invention to provide a
technique for quickly compressing large amounts of image data.

It is a further object of the present invention to provide a
technique which facilitates high compression rates for either color
or non-color image data.

It is yet another object of the present invention to provide a
technique which gives satisfactory results when used to compress
either color or non-color image data.

Additional objects, advantages, and novel features of the present
invention will become apparent to those skilled in the art from
this disclosure, including the following detailed description, as
well as by practice of the invention. While the invention is
described below with reference to preferred embodiment(s), it
should be understood that the invention is not limited thereto.
Those of ordinary skill in the art having access to the teachings
herein will recognize additional implementations, modifications,
and embodiments, as well as other fields of use, which are within
the scope of the invention as disclosed and claimed herein and with
respect to which the invention could be of significant utility.

Summary Disclosure of the Invention 

According to the present invention, an encoder for compressing
image information includes a memory and processor. The memory is
configured to store a sequence of characters representing an image.
The processor is configured to determine if the stored sequence of
characters corresponds to a banded image, such as a segment or
slice across the entire image, or a page image, such as one of
multiple separate images making up the entire image. The processor
operates in a first mode to encode the stored sequence of
characters, if the stored sequence of characters is determined to
correspond to the banded image. The processor operates in a second
mode, different than the first mode, to encode the stored sequence
of characters, if the stored sequence of characters is determined
to correspond to the page image.

Preferably, the processor encodes the stored sequence of
characters in accordance with a pack-bit compression technique in
the first mode of operation and in accordance with an LZW
compression technique in the second mode of operation.
Beneficially, the processor is also configured to encode the stored
sequence of characters in accordance with a pack-bit compression
technique in the second mode of operation, as this can be
beneficial in some cases.



Advantageously, if the stored sequence of characters is
determined to correspond to the page image, the processor is
further configured to determine if the stored sequence of
characters corresponds to a primarily white page image or a
primarily black page image, which might be the case for a template
type page image. If so, the processor encodes the stored sequence
of characters in accordance with a first compression technique,
e.g. a pack-bit compression technique, while operating in the
second mode of operation. If not, the processor encodes the stored
sequence of characters in accordance with a second
compression technique, different than the first compression
technique, e.g. an LZW compression technique, while operating in
the second mode of operation.

In one practical implementation, an imaging system may include
a raster image processor which determines if a sequence of
characters corresponds to a banded image or a page image. The
raster image processor then operates in a first mode to encode the
sequence of characters if the sequence of characters is determined
to correspond to the banded image, and operates in a second mode,
different than the first mode, to encode the sequence of characters
if the sequence of characters is determined to correspond to the
page image.

An imager controller receives the encoded sequence of
characters. The imager controller then operates in either a first
mode or second mode to decode the received encoded sequence of
characters back into the unencoded sequence of characters. More
particularly, the controller operates in a first mode if the
encoded sequence of characters corresponds to the banded image, and
in a second mode if the encoded sequence of characters corresponds
to the page image.

Preferably, the raster image processor encodes the sequence of
characters in accordance with a pack-bit compression technique in
the first mode of operation, and in accordance with an LZW
compression technique in the second mode of operation.
Beneficially, the raster image processor is also capable of
encoding the sequence of characters in accordance with a pack-bit
compression technique in the second mode of operation.

In accordance with other aspects of the invention, if the
sequence of characters is determined to correspond to the
page image, the raster image processor determines if the sequence
of characters corresponds to a primarily white page image or a
primarily black page image. If so, the raster image processor
encodes the sequence of characters in accordance with a first
compression technique, such as a pack-bit technique, while
operating in the second mode of operation. If not, the raster
image processor encodes the sequence of characters in accordance
with a second compression technique which is different than the
first compression technique, such as an LZW technique, while
operating in the second mode of operation.

Brief Description of Drawings

Figure 1 is an exemplary simplified depiction of an image
processing system in accordance with a first embodiment of the
present invention.

Figure 2 is an exemplary simplified depiction of an image
processing system in accordance with a second embodiment of the
present invention.

Figure 3 depicts an exemplary code dictionary in accordance
with the second embodiment of the present invention.



Best Mode for Carrying out the Invention 

In pre-press imaging, particularly for the flats having an
entire plate worth of image information, most of the data is often
either solid black or solid white, and hence digitally represented
in binary form by all 1's or 0's. For halftone images all of the
data is black and white.

Figure 1 is a somewhat simplified, exemplary depiction of an
image processing system 1000 according to a first embodiment of the
present invention. The system 1000 includes a raster image
processor (RIP) 1050, which includes a processor 1050a and a memory
1050b for storing processing instructions and other data as
required. The RIP 1050 receives image data and converts the image
data into encoded data. The encoded data is then transmitted to an
imager control processor 1100, which includes a processor 1100a and
a memory 1100b for storing processing instructions and other data
as required. The controller 1100 generates control signals to the
imager 1150 in accordance with the data received from the RIP 1050,
to control the imager 1150. More particularly, the control signals
from the processor 1100a control the operation of the imager
scanning assembly 1150a so as to form the image on a medium 1150c,
such as a film or plate, supported within the imager 1150. As
shown, the imager includes a cylindrical drum 1150b for supporting
the medium 1150c, but could alternatively include a flat bed or
external drum for supporting the medium.

In a first mode of operation, which will hereafter be referred
to as a flat banding mode, the RIP 1050 receives an 80 inch x 50
inch color separated image having 2400 dots per inch. Preferably
using imposition software on a front end preprocessor (not shown),
the image could, for example, correspond to multiple pages of a
magazine. In such a case, the image is formatted such that the
image printed from the imaged medium 1150c is positioned so as to
facilitate cutting, folding and stitching to create multiple
properly printed and positioned magazine pages. In any event, the
RIP 1050 converts the entire image into multiple gigabytes of
encoded data as a single job.

However, due to processing power limitations of the RIP 1050, the
entire image cannot be converted into encoded data in a single
operational process. Accordingly, the image is sliced into bands
prior to being converted, typically by the RIP 1050. If sliced
by the RIP 1050, this banding of the image may be performed by the
RIP processor 1050a. However, the banding could also be performed
outside the RIP, for example by a preprocessor (not shown). In the
preferred embodiment, the RIP processor 1050a converts the image
data representing each of the bands into encoded data in a separate
operational process. Thus, the job is completed only after the
multiple separate operational processes are performed by the RIP
1050 so as to convert all of the image data representing the bands
for the entire image into encoded data. In practice, the larger the
image the smaller is each image band, with all bands preferably
being equal in size. Furthermore, the larger the image the greater
the startup time required before beginning the conversion of the
image data to encoded data, because the larger the image, the more
pre-conversion processing is required. Additionally, the more
objects included in the image, the more memory is required.
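
The slicing of an image into bands of preferably equal size can be
illustrated with a short sketch. The text does not say how band
size is chosen, so the per-band row limit max_rows_per_band, the
function name, and the (start_row, end_row) band representation are
all assumptions made for illustration only.

```python
import math


def slice_into_bands(total_rows: int, max_rows_per_band: int) -> list:
    """Split rows into the fewest near-equal bands within the row limit."""
    n_bands = math.ceil(total_rows / max_rows_per_band)
    base, extra = divmod(total_rows, n_bands)
    bands = []
    start = 0
    for i in range(n_bands):
        # The first `extra` bands take one additional row each.
        rows = base + (1 if i < extra else 0)
        bands.append((start, start + rows))  # end row is exclusive
        start += rows
    return bands
```

With a hypothetical limit of 4096 rows per band, the 120,000 rows of
a 50 inch, 2400 dots-per-inch image would split into 30 bands of
4000 rows each, and each band could then be converted in its own
operational process.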

In a second mode of operation, sometimes referred to as a page
assembly mode, the RIP 1050 receives, as multiple images, an 80
inch x 50 inch color image having 2400 dots per inch. In this
case, one of the multiple images, which might be characterized as a
template image, includes information such as registration marks,
color gradients, and identification marks, but is primarily white.
Each of the other of the multiple images could, for example, be the
image for a separate page of a magazine. Here, the RIP 1050 may be
operated to convert the image data representing the entire template
image into encoded data as one job and to convert all of the image
data representing the other of the multiple images into encoded
data in another job. When fully converted, the multiple images will
be represented by encoded data.

More particularly, in the page assembly mode, the image is
divided into page assemblies, one of which is a primarily white
template image which is typically processed by the RIP processor
1050a without being split into bands. The other of the multiple
images are, however, typically sliced into bands prior to being
encoded by the RIP 1050. Because the area of each of the other
multiple images is much smaller than the area of the entire image
discussed with reference to the first mode of operation, fewer
bands are required and, as a whole, it will take less time to
convert the image data representing the multiple images into
encoded data in the page assembly mode than the time required to
convert the image data representing the entire image into encoded
data in the banding mode discussed above. Thus, the RIP processor
1050a converts the image data for each of the bands for each of the
other multiple images into encoded data in a separate operational
process. The job, or jobs if the template image is pre-processed,
is completed only after the multiple operational processes are
performed to convert all of the image data representing the
multiple images forming the entire image into encoded data.

Although the image discussed with reference to the page
assembly mode may be the same as the image discussed with reference
to the prior banding mode, conversion in the page assembly
mode will typically result in an even greater amount of encoded
data than the conversion in the previously discussed banding mode.
For example, two gigabytes of encoded data may be generated by the
RIP 1050 to represent the image in the banding mode, while three
gigabytes of encoded data could be generated by the RIP 1050 to
represent the same image in the page assembly mode because there
would be more uncompressed data. Further, whether the banding or
page assembly mode is utilized by the RIP 1050, the entire image
cannot be converted into encoded data in a single operational
process due to processing power limitations of the RIP 1050.



In the banding mode, the image bands may be satisfactorily
converted using an LZW technique. In the page assembly mode, a
template image, of say 16 megabytes, may be satisfactorily
converted using a PB technique. However, the PB technique will
often produce unsatisfactory results if used to convert the bands
of the other of the multiple images. Accordingly, in the page
assembly mode, these bands are converted using an LZW technique.
Thus, in the page assembly mode, different compression techniques
are utilized for a single image and perhaps even in a single job.

DOCKET NO: 3175-51A    PATENT    CLIENT NO: XP-0898A

Accordingly, in the first embodiment of the present invention,
the RIP 1050 is selectively operable in either the banding or the
page assembly mode of operation. Hence, in operation, the RIP 1050
initially scans the received image data representing the image, or
image bands if the bands are sliced during pre-processing, to
determine if banding mode or page assembly mode operation is
required. If it is determined that banding mode operation is
required, the RIP 1050 implements an LZW technique to convert the
image data into encoded data. If, on the other hand, it is
determined that page assembly mode operation is required, the RIP
1050 further determines if the page image data represents a template
image or banded image. If it is determined that the page image data
represents a template image, the RIP 1050 implements a PB technique
to convert the template image into encoded data. If, however, it
is determined that the page image data represents a banded image,
the RIP 1050 implements an LZW technique to convert the banded image
data into encoded data. The selective operation of the RIP 1050,
depending on the received image data, facilitates the more efficient
and effective processing of different types of large images than
has been previously obtainable in conventional RIPs.
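
The selection logic of the first embodiment, as described in the
preceding paragraphs, can be sketched as a small decision function.
The names below (Technique, is_primarily_solid, select_technique),
the string mode labels and the 95% threshold are hypothetical; the
text states only that the RIP scans the data to decide and does not
specify how a template image is recognized.

```python
from enum import Enum


class Technique(Enum):
    PB = "pack-bit"
    LZW = "LZW"


def is_primarily_solid(data: bytes, threshold: float = 0.95) -> bool:
    """Guess whether data is mostly all-zero or all-one bytes.

    Assumption: a template image is recognized by the share of
    0x00 or 0xFF bytes; the threshold is illustrative only.
    """
    if not data:
        return False
    white = data.count(0x00)
    black = data.count(0xFF)
    return max(white, black) / len(data) >= threshold


def select_technique(mode: str, page_data: bytes = b"") -> Technique:
    """Pick a compression technique for the given mode of operation."""
    if mode == "banding":
        return Technique.LZW
    if mode == "page_assembly":
        # Template-type page image: PB; other page bands: LZW.
        if is_primarily_solid(page_data):
            return Technique.PB
        return Technique.LZW
    raise ValueError(f"unknown mode: {mode!r}")
```

Keeping the decision in one place means a single job can mix
techniques, using PB for the template image and LZW for the bands of
the remaining page images.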

According to a second embodiment of the present invention, as
a stream of sequential data is processed prior to encoding, if, at
the start of the sequence, the immediately preceding character,
which is yet to be encoded, matches the next character in the
stream and this next character is either solid black or solid
white, and hence digitally represented in binary form by all 1's or
0's, encoding is interrupted. During the interruption, a
determination is made as to whether the one or more characters
immediately following the next character in the sequence also
match the next character.
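
The check for a run of solid black or solid white characters can be
sketched as below. The function name and the use of byte values
0x00 and 0xFF for solid characters are assumptions made for
illustration; the sketch only measures the run and does not encode
it.

```python
SOLID_BYTES = (0x00, 0xFF)  # assumption: all-zero and all-one bytes


def solid_run_length(data: bytes, start: int) -> int:
    """Length of the run of identical solid bytes beginning at start.

    Returns 0 when the byte at start is not solid black or white.
    """
    if start >= len(data) or data[start] not in SOLID_BYTES:
        return 0
    end = start
    while end < len(data) and data[end] == data[start]:
        end += 1
    return end - start
```

An encoder built along these lines would call such a function when
it meets a solid character and, for a sufficiently long run, handle
the whole run in one step instead of character by character.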

The second embodiment of the invention will now be described
with reference to Figure 2. As shown, Figure 2 represents a
somewhat simplified, exemplary depiction of an image processing
system 2000. The system 2000 includes a raster image processor
(RIP) 2050, which receives an image and converts the image into
encoded data. The encoded data is then transmitted to imager
controller 2100, which generates control signals to the imager 1150
in accordance with the encoded data received from the RIP 2050 to
control the imager 1150 after decoding the received data. This
imager 1150 is identical to the imager 1150 of Figure 1. More
particularly, the control signals from the imager controller
processor 2100a control the operation of the imager scanning
assembly 1150a so as to form the image on a medium 1150c, which
could be identical to the medium 1150c in Figure 1. The medium
1150c is supported within the imager 1150 of Figure 2. As shown,
the imager 1150 includes a cylindrical drum 1150b for supporting
the medium 1150c.

In the second embodiment of the present invention, the RIP
processor 2050a implements a compression technique, which will
hereafter be referred to as the AGFA compression technique. Using
the AGFA compression technique, variable length strings of
byte-based data can be processed. Processing using the AGFA
technique will be substantially faster for many large image
applications than the LZW compression techniques, while still
providing satisfactory results for color images as well as those
which are primarily black and white. Further, the AGFA technique is
not limited to single bytes of data, and is therefore capable of
compressing data on an arbitrary pixel or bit boundary basis.
Additionally, the AGFA technique is capable of providing a higher
compression rate than both PB and LZW compression techniques.



An exemplary representation of a stream of sequential input
data as it would appear entering a RIP processor 2050a prior to
encoding could, for example, include the string of characters
"abc0....01c01". The string "0....0" is a large string of zeros,
for example representing image information for 32k pixels.

Using the AGFA technique, the encoding and decoding
processors, i.e. the RIP processor 2050a and imager controller
processor 2100a, must coordinate on the transmission and receipt of
codes, similar to the coordination required by LZW techniques.
However, as will be described further below, the AGFA technique
uses a compression dictionary containing four pre-defined
compression codes. The characters in the input string are scanned
to determine if a scanned sub-string of characters matches certain
of these pre-defined compression codes. If so, the matching
sub-string of characters is encoded with the applicable pre-defined
compression code. If a sub-string of characters does not match a
pre-existing code, new codes corresponding to the sub-strings are
added to the dictionary.

Further, the AGFA technique provides a look-ahead function,
which determines whether or not the sub-string is longer than a
minimum number of bytes, preferably 6, and if so the sub-string is
encoded with a new code, which includes any applicable pre-existing
code, and the length of the code field. The length is the width of
the pre-existing code, with this code forming the most significant
bits and serving as a continuation indicator, and any new coding,
with this coding forming the least significant bits. Like LZW
techniques, sub-strings are initially defined by codes having 9
bits or digits, but may be increased to up to 12 bits to add new
codes. Once the 12 bit limit is exceeded, the dictionary is reset
and subsequent codes are again defined initially with 9 bits.
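
The growth of code width from 9 to 12 bits, followed by a dictionary
reset, can be sketched with two small helpers. The names width_for
and needs_reset are hypothetical, and the sketch assumes the width
is simply the number of bits needed to represent a code value, never
less than 9; practical LZW-style coders may switch widths at
slightly different points.

```python
MIN_BITS = 9    # initial code width stated in the text
MAX_BITS = 12   # maximum code width stated in the text


def width_for(code: int) -> int:
    """Bits used to write a code value, never fewer than MIN_BITS."""
    return max(MIN_BITS, code.bit_length())


def needs_reset(next_code: int) -> bool:
    """True once the next code no longer fits in MAX_BITS."""
    return next_code.bit_length() > MAX_BITS
```

Under this assumption, codes up to 511 are written with 9 bits, 512
through 1023 with 10 bits, and so on up to 4095 with 12 bits; a next
code of 4096 would trigger the reset.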

Referring to Figure 3, in the AGFA technique, four codes are
predefined and stored in the code dictionary 3000 on RIP memory
2050b as codes 1330. In the present example these codes are the
code 100, representing a sub-string of all zero bytes which
corresponds to white, the code 101, representing a sub-string of
all one bytes which corresponds to black, the code 102,
representing a reset, and the code 103, representing an end of the
compressed encoded data. In the present example, codes 104, 105,
106, etc. represent sub-strings of new patterns which are generated
during the processing of the input data and also stored on RIP
memory 2050b in code dictionary 3000. It will be recognized that
because codes for the strings corresponding to white and black are
partially predefined, reduced processing is required to generate
these codes, since the predefined codes can simply be read by RIP
processor 2050a from the dictionary codes as required.
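
The four pre-defined codes of this example might be held in a small
lookup structure such as the one sketched below. The names
Predefined, RUN_CODES, and FIRST_NEW_CODE are hypothetical, and the
use of the characters "0" and "1" to stand for the white and black
bytes follows the notation of the examples in this description.

```python
from enum import IntEnum


class Predefined(IntEnum):
    """The four pre-defined codes of the present example."""
    WHITE_RUN = 100    # sub-string of all zero bytes (white)
    BLACK_RUN = 101    # sub-string of all one bytes (black)
    RESET = 102        # dictionary reset
    END_OF_DATA = 103  # end of the compressed encoded data


# Characters that have a pre-defined run code (notation of the examples).
RUN_CODES = {"0": Predefined.WHITE_RUN, "1": Predefined.BLACK_RUN}

# Lowest code value the text lists for newly generated patterns.
FIRST_NEW_CODE = max(Predefined) + 1

print(int(RUN_CODES["0"]), FIRST_NEW_CODE)  # 100 104
```

Because the run codes are fixed in advance, obtaining one is a
single lookup rather than a dictionary insertion.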

Using the AGFA technique, the RIP processor 2050a first sets a
reset code 102 read from the code dictionary 3000 and reads the "a"
in the sequence and the "b" immediately thereafter in the sequence.
The RIP processor 2050a then determines, from code dictionary 3000,
if a code exists for the character sequence "ab". Since, in this
example, no such code exists at this point in the processing, a new
code 105 is generated to represent the new pattern string "ab" and
stored in the code dictionary 3000 on memory 2050b. The RIP
processor 2050a continues by reading the "c" immediately following
the "b" in the sequence. The RIP processor 2050a determines if a
code exists for the character sequence "bc". Since, in this
example, no such code exists at this point in the processing, a new
code 106 is generated to represent the new pattern string "bc" and
stored in dictionary 3000.
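
The lookup-then-add step just described might be sketched as
follows. The function name register_pairs is hypothetical, and the
sketch is deliberately limited to adjacent character pairs, whereas
an LZW-style encoder would also extend sub-strings already in the
dictionary. The starting value 105 is taken from the example above.

```python
def register_pairs(chars: str, first_code: int = 105) -> dict:
    """Assign a new code to each adjacent pair not yet in the table."""
    table = {}
    next_code = first_code
    for left, right in zip(chars, chars[1:]):
        pair = left + right
        if pair not in table:       # no code exists yet for this pair
            table[pair] = next_code
            next_code += 1
    return table


print(register_pairs("abc"))  # {'ab': 105, 'bc': 106}
```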

The RIP 2050 continues by reading the "0" immediately following
the "c" in the sequence. The RIP processor 2050a determines if a
code exists for the character sequence "c0". Since, in this
example, no such code exists at this point in the processing, a new
code 107 is generated to represent the new pattern string "c0" and
stored in dictionary 3000. Also, because the "0" is recognized as
special, the RIP processor 2050a automatically scans ahead to read
the next character in the sequence to determine if it matches the
initial "0" in the sequence. If not, the scanning ahead is
immediately discontinued and the RIP processor 2050a proceeds with
normal processing. If so, the scanning ahead continues on a
character by character basis until no match with "0" is found, at
which point the scanning ahead is discontinued and normal
processing continues.

In this exemplary application of the AGFA technique, the RIP
processor 2050a scans ahead and counts the number of "0" or "1"
bytes in the sequence. Preferably, a compression threshold is
pre-established and stored on the RIP memory 2050b. For example,
the threshold might correspond to a 4 to 1 compression rate. If
such a threshold is utilized and the number of "0" or "1" bytes
counted is less than the number required to meet or exceed the
threshold, e.g. if the sequence consists of only one or two zeros
or ones, then a new code would be established for the sequence in
the normal manner. Only if the number of "0" or "1" bytes counted
meets or exceeds the threshold is the sequence encoded using the
applicable pre-defined code 100 or 101.
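
The scan-ahead and threshold test might be sketched as follows. The
function names and the parameter min_run are hypothetical. In
particular, the text expresses the threshold as a compression rate
and does not say how that rate converts to a byte count, so min_run
is simply supplied by the caller here. The sample string is likewise
an assumption, built to resemble the input of the present example.

```python
from typing import Optional, Tuple

RUN_CODES = {"0": 100, "1": 101}  # pre-defined codes for white and black


def count_run(chars: str, start: int) -> int:
    """Scan ahead from start and count how often chars[start] repeats."""
    first = chars[start]
    end = start + 1
    while end < len(chars) and chars[end] == first:
        end += 1
    return end - start


def run_code(chars: str, start: int, min_run: int) -> Optional[Tuple[int, int]]:
    """Return (pre-defined code, run length), or None for normal processing."""
    ch = chars[start]
    if ch not in RUN_CODES:
        return None              # not a special character: no scan-ahead
    run = count_run(chars, start)
    if run < min_run:
        return None              # run too short to meet the threshold
    return RUN_CODES[ch], run


sample = "abc" + "0" * 10 + "1c01"     # hypothetical input
print(run_code(sample, 3, min_run=4))   # (100, 10)
print(run_code(sample, 13, min_run=4))  # None
```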

Assuming in the present example that the number of "0" bytes
counted by the RIP processor 2050a meets or exceeds the threshold,
the count is determined to be a repeat count. Either 9, 10, 11, or
12 bits can be used to code the repeat count. However, if the count
is so great that more than 11 bits would be required for the
encoding, a continue code, which may be generated by processor
2050a or retrieved from memory 2050b, is inserted as the least
significant bit in the output code to enable the output codes
representing the entire sequence of zeros or ones to be strung
together. Accordingly, no matter how long the sequence, the low or
less significant bit of each output code within the string of
output codes would represent an end or continuation of the coding.
Hence, 1 bit is sacrificed for the end/continuation bit, leaving 8,
9, 10, or 11 bits for the repeat count.
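
One way the repeat count could be split across several output codes
is sketched below. It assumes, following the paragraph above, that
the end/continuation flag occupies the least significant bit of each
output code, with 1 meaning that another code follows. The function
names and the choice to emit the most significant portion of the
count first are assumptions of this sketch.

```python
def encode_repeat_count(count: int, code_bits: int = 9) -> list:
    """Split count into output codes of code_bits bits each."""
    if not 9 <= code_bits <= 12:
        raise ValueError("code_bits must be between 9 and 12")
    if count < 0:
        raise ValueError("count must be non-negative")
    payload_bits = code_bits - 1        # one bit is given up for the flag
    mask = (1 << payload_bits) - 1
    chunks = []
    while True:
        chunks.append(count & mask)
        count >>= payload_bits
        if count == 0:
            break
    chunks.reverse()                    # assumed order: high portion first
    last = len(chunks) - 1
    # The flag is 1 on every code except the final one.
    return [(c << 1) | (1 if i < last else 0) for i, c in enumerate(chunks)]


def decode_repeat_count(codes: list, code_bits: int = 9) -> int:
    """Reassemble the count, stopping at the code whose flag is clear."""
    payload_bits = code_bits - 1
    count = 0
    for code in codes:
        count = (count << payload_bits) | (code >> 1)
        if code & 1 == 0:
            break
    return count


print(encode_repeat_count(10))    # [20]
print(encode_repeat_count(1000))  # [7, 464]
```

The decoder reverses the split, so decode_repeat_count([7, 464])
yields 1000 again.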

Accordingly, in the present example the output code for the
repeat count of "0" characters would be formed with the code "100"
to indicate that this is a sequence of "0" characters, followed by
"102" representing a first portion of the repeat count, and "001"
indicating that the output codes for the repeat count continue.
Thus, the first code in the string of repeat count output codes
would be "100102001". The second code in the string of repeat
count output codes could be "102001", with the "102" representing a
second portion of the repeat count, and "001" indicating that the
output codes for the repeat count continue. The last code in the
string of repeat count output codes could be "0201". The high bit
of the last output code "0201" is made clear to indicate that this
is the end of the repeat count information in this field.

Using multiple repeat count output codes, the strung together
codes for the entire repeat count would, in the above example, be
"1001020011020010201". Thus, the strung together multiple bytes of
output codes provide a full representation of the repeat count. In
practice, five output codes may be used to represent up to four
billion characters. Notwithstanding the number of bits in the
output codes, the high bit is used to represent the count.
Accordingly, whatever output code size is used, full advantage is
taken of all available bits for the repeat count.
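
The capacity figure above can be checked with simple arithmetic. The
sketch below assumes that one bit of every count code is reserved
for the end/continuation flag, and it reads "five output codes" as
one pre-defined run code followed by four 9 bit count codes. That
reading is an interpretation adopted for illustration; the text does
not spell out the breakdown. The function name is hypothetical.

```python
def max_repeat_count(count_codes: int, code_bits: int = 9) -> int:
    """Largest count that fits in count_codes codes of code_bits bits."""
    payload_bits = code_bits - 1   # one bit per code carries the flag
    return (1 << (count_codes * payload_bits)) - 1


print(max_repeat_count(4))  # 4294967295
```

Four count codes of 9 bits each thus carry 32 bits of count, or a
little over four billion.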

It is perhaps worthwhile emphasizing here that conventional
LZW techniques lack the ability to scan ahead. Conventional PB
techniques, on the other hand, scan ahead to locate matches with
whatever character has been read and must fully generate the match
coding for each matching sequence. In contrast, the present
invention scans ahead to locate matches with only selected
characters, preferably only white and black, respectively
represented herein by "0" and "1". Further, the present invention
uses a predefined code for each of the selected characters, e.g.
white and black, and hence the match coding for each matching
sequence need only be partially generated, since the predefined
code, e.g. code 100 or 101, which identifies the applicable
sequence as a sequence of white or black characters, is
pre-generated and need only be read from the code dictionary 3000.
Accordingly, the present invention is capable of providing superior
encoding of, for example, large images using less computing
resources and computing time.

As noted above, once the RIP processor 2050a determines it is
at the last "0" in the sequence, i.e. by determining from the
scanning ahead on a character by character basis that a next
character does not match "0", the scanning ahead is discontinued
and normal processing continues. Thus, the RIP processor 2050a
continues by reading the "1" immediately following the last "0" in
the sequence. The processor 2050a determines if a code exists for
the character sequence "01". Since, in this example, no such code
exists at this point in the processing, a new code 108 is generated
to represent the new pattern string "01". The processor 2050a
proceeds by reading the "c" immediately following the "1" in the
sequence. The processor determines if a code exists for the
character sequence "1c". Since, in this example, no such code
exists at this point in the processing, a new code 109 is generated
to represent the new pattern string "1c".

The processor further proceeds by reading the "0" immediately
following the second "c" in the sequence. The processor 2050a
determines if a code exists for the character sequence "c0". In
this example, such a code, i.e. code 107, does exist. The RIP
processor 2050a also scans ahead to determine if another "0"
immediately follows this occurrence of "c0". Since, in this case,
the RIP processor 2050a determination is negative, the scanning
ahead is discontinued and normal processing continues.

The processor 2050a now proceeds by reading the "1"
immediately following the second "c0" in the sequence. The
processor 2050a determines if a code exists for the character
sequence "c01". Since, in this example, such a code does not
exist, a new code 110 is generated to represent the new pattern
string "c01", which can be represented as "1071". The RIP processor
2050a also scans ahead to determine if another "1" immediately
follows this occurrence of "c01". Since, in this case, the RIP
processor 2050a determination is negative, the scanning ahead is
discontinued and normal processing would continue if further
characters remained to be encoded. However, since the "c01" are
the final characters, encoding ends.
The processor ultimately generates encoded output data forming
a sequence including the string of characters
"102abc10010200110200102011071103".

Similar to LZW techniques, in the AGFA technique, the RIP
processor 2050a builds a tree of numerous codes generated using
pre-defined or other codes and thereby is capable of providing
satisfactory results even though the processing is performed on a
byte by byte basis to find repeating bytes. However, as compared
to LZW techniques, in the AGFA technique, the processing time and
resources required by the RIP 2050 to encode large sequences are
substantially reduced through the use of special pre-defined codes.
The decoding processor, i.e. the imager control processor 2100a,
which could serve as a printer controller (not shown) or be some
other type of decoding device, builds a similar tree using the
codes in the code dictionary received from the encoding processor,
i.e. the RIP processor 2050a. Basically, the decoding processor
performs the reciprocal of the encoding process to decode the
encoded sequence of characters "102abc10010200110200102011071103".
It should be understood that the encoded data could, if desired, be
transmitted to the decoding device via a direct communications
link, a local network, a public network such as the Internet, or
some other type of network. Further, such communications may be by
wire communications or wireless communications.

It will also be recognized by those skilled in the art that, while
the invention has been described above in terms of one or more
preferred embodiments, it is not limited thereto. Various features
and aspects of the above described invention may be used
individually or jointly. Further, although the invention has been
described in the context of its implementation in a particular
environment and for particular purposes, e.g. imaging, those
skilled in the art will recognize that its usefulness is not
limited thereto and that the present invention can be beneficially
utilized in any number of environments and implementations.
Accordingly, the claims set forth below should be construed in view
of the full breadth and spirit of the invention as disclosed herein.